Preliminary Experiments with Geo-Filtering Predicates for Geographic IR

Author

  • Jochen L. Leidner
Abstract

This paper describes a set of experiments for monolingual English retrieval at GEO-CLEF 2005. We evaluate a technique for spatial retrieval based on named entity tagging, toponym resolution, and re-ranking by means of geographic filtering. To this end, we present a series of systematic experiments in the Vector Space paradigm. We investigate plain bag-of-words versus a kind of phrasal retrieval and the potential of meronymic query expansion as a recall-enhancing device, and we compare three alternative geo-spatial filtering techniques based on spatial clipping. We evaluate these on 25 monolingual English queries. Our preliminary results show that always choosing toponym referents with a simple “maximum population” heuristic, used to approximate the salience of a referent, fails to outperform TF*IDF baselines on the GEO-CLEF 2005 dataset when combined with three geo-filtering predicates. Conservative geo-filtering outperforms more aggressive predicates. The evidence further seems to suggest that query expansion with WordNet meronyms is not effective in combination with the method described. A cursory post-hoc analysis indicates that factors responsible for the low performance include sparseness of available population data, gaps in the gazetteer that associates Minimum Bounding Rectangles with geo-terms in the query, and the composition of the GEO-CLEF 2005 dataset itself.
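As a rough illustration of the ideas named in the abstract, the following Python sketch shows a "maximum population" referent choice and a filter that clips a ranked list against a query's Minimum Bounding Rectangle. It is a minimal sketch under stated assumptions, not the paper's implementation: the abstract does not define its three predicates, so `any_inside` and `all_inside` are hypothetical stand-ins, and all class names, function names, and data values are invented for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass(frozen=True)
class Referent:
    """One candidate location for a toponym (hypothetical structure)."""
    name: str
    lat: float
    lon: float
    population: Optional[int]  # None when no population figure is available


@dataclass(frozen=True)
class MBR:
    """Minimum Bounding Rectangle in degrees (ignores date-line wrap-around)."""
    min_lat: float
    min_lon: float
    max_lat: float
    max_lon: float

    def contains(self, lat: float, lon: float) -> bool:
        return (self.min_lat <= lat <= self.max_lat
                and self.min_lon <= lon <= self.max_lon)


def resolve_max_population(candidates: List[Referent]) -> Optional[Referent]:
    """Pick the candidate with the largest known population.

    Candidates without population data rank last, which is one way sparse
    population data can distort the choice.
    """
    if not candidates:
        return None
    return max(candidates,
               key=lambda r: r.population if r.population is not None else -1)


# Illustrative predicates only; assumed, not taken from the paper.
def any_inside(referents: List[Referent], mbr: MBR) -> bool:
    """Keep a document if at least one resolved location is in the MBR."""
    return any(mbr.contains(r.lat, r.lon) for r in referents)


def all_inside(referents: List[Referent], mbr: MBR) -> bool:
    """Keep a document only if every resolved location is in the MBR."""
    return bool(referents) and all(mbr.contains(r.lat, r.lon) for r in referents)


Doc = Tuple[str, float, List[Referent]]  # (doc_id, score, resolved locations)


def geo_filter(ranked: List[Doc], query_mbr: MBR,
               predicate: Callable[[List[Referent], MBR], bool]) -> List[Doc]:
    """Drop documents failing the predicate, preserving the original order."""
    return [doc for doc in ranked if predicate(doc[2], query_mbr)]


# Toy data with made-up coordinates and populations.
inside = Referent("A", 5.0, 5.0, 100)
outside = Referent("C", 50.0, 50.0, 10)
ranked_docs = [("d1", 0.9, [inside]), ("d2", 0.8, [inside, outside]), ("d3", 0.7, [])]
query_box = MBR(0.0, 0.0, 10.0, 10.0)

kept_any = [d[0] for d in geo_filter(ranked_docs, query_box, any_inside)]
kept_all = [d[0] for d in geo_filter(ranked_docs, query_box, all_inside)]
```

In this toy setup `all_inside` removes more documents than `any_inside`; which of the paper's own predicates count as conservative or aggressive is not stated in the abstract.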

Similar resources

Re-Ranking for Geo-Relevance With Non-Contextual Heuristics at GeoCLEF 2007

Geographic Information Retrieval (GIR) is an attempt to improve relevance by taking geographic information in textual documents into account. We describe our experiments carried out at the GeoCLEF 2007 evaluation [1] that investigate further the role of geo-filtering based re-ranking and query expansion with geographic terms. Our main findings are that manual query expansion with geo-terms is m...

NICTA I2D2 Group at GeoCLEF 2006

We report on the experiments undertaken by the NICTA I2D2 Group as part of GeoCLEF 2006, as well as post-GeoCLEF evaluations and improvements to the submitted system. In particular, we used techniques to assign probabilistic likelihoods to geographic candidates for each identified geo-term, and a probabilistic IR engine. A normalisation process that adjusts term weights, so as to prevent expand...

Evaluating Geographic Information Retrieval

The processing steps required for geographic information retrieval include many steps that are common to all forms of information retrieval, e.g. stopword filtering, stemming, vocabulary enrichment, understanding Booleans, and fluff removal. Only a few steps, in particular the detection of geographic entities and the assignment of bounding boxes to these, are specific to geographic IR. The pape...

A Geo-textual Search Engine Approach Assisting Disaster Recovery, Crisis Management and Early Warning Systems

This work has been funded by the Oberfrankenstiftung. This paper presents an approach used in geo-textual search engines for application in security related domains like disaster recovery or early warning systems. Current approaches suffer from search conditions utilizing some combined scheme of textual and geographical search predicates. Standard retrieval engines support only either text...

Preliminary Beneficiation and Washability Studies on Ghouzlou's Low-Ash Coal Sample

In the present research work, a low-ash coal, from Ghouzlou deposit in Iran, with an average ash content of 12% was subjected to some beneficiation experiments such as heavy media separation and flotation. Sieve analysis showed that 62.3% of the coal sample with the size of +2 mm had around 7.3% ash contents. Also, heavy media tests carried out on five size fractions revealed that by setting th...

Publication date: 2005